AITopics

2604.20492

Country:

Europe > Austria > Vienna (0.14)
Europe > France (0.05)
Oceania > French Polynesia (0.04)
(10 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Daunas, Francisco, Esnaola, Iñaki, Perlaza, Samir M.

A Dual Optimization View to Empirical Risk Minimization with f-Divergence Regularization

arXiv.org Machine LearningAug-6-2025

--The dual formulation of empirical risk minimization with f -divergence regularization (ERM-f DR) is introduced. The solution of the dual optimization problem to the ERM-f DR is connected to the notion of normalization function introduced as an implicit function. This dual approach leverages the Legendre-Fenchel transform and the implicit function theorem to provide a nonlinear ODE expression to the normalization function. Furthermore, the nonlinear ODE expression and its properties provide a computationally efficient method to calculate the normalization function of the ERM-f DR solution under a mild condition. Empirical risk minimization (ERM) [1]-[6] is often posed as an optimization problem regularized by a statistical distance between the probability measure to be optimized and a given reference measure [7]-[13].

artificial intelligence, machine learning, normalization function, (15 more...)

2508.03314

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > France > Provence-Alpes-Côte d'Azur (0.04)
Oceania > French Polynesia (0.04)
(9 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.56)

arXiv.org Artificial IntelligenceMar-6-2025

Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks

Wang, Zhilin, Zeng, Jiaqi, Delalleau, Olivier, Egert, Daniel, Evans, Ellie, Shin, Hoo-Chang, Soares, Felipe, Dong, Yi, Kuchaiev, Oleksii

Inference-Time Scaling has been critical to the success of recent models such as OpenAI o1 and DeepSeek R1. However, many techniques used to train models for inference-time scaling require tasks to have answers that can be verified, limiting their application to domains such as math, coding and logical reasoning. We take inspiration from how humans make first attempts, ask for detailed feedback from others and make improvements based on such feedback across a wide spectrum of open-ended endeavors. To this end, we collect data for and train dedicated Feedback and Edit Models that are capable of performing inference-time scaling for open-ended general-domain tasks. In our setup, one model generates an initial response, which are given feedback by a second model, that are then used by a third model to edit the response. We show that performance on Arena Hard, a benchmark strongly predictive of Chatbot Arena Elo can be boosted by scaling the number of initial response drafts, effective feedback and edited responses. When scaled optimally, our setup based on 70B models from the Llama 3 family can reach SoTA performance on Arena Hard at 92.7 as of 5 Mar 2025, surpassing OpenAI o1-preview-2024-09-12 with 90.4 and DeepSeek R1 with 92.3.

annotator, preprint, zhang, (15 more...)

2503.04378

Country:

Oceania > French Polynesia (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Hawaii (0.04)
(12 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Leisure & Entertainment > Social Events (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)

Perlaza, Samir M., Bisson, Gaetan

Variations on the Expectation Due to Changes in the Probability Measure

arXiv.org Artificial IntelligenceFeb-4-2025

Closed-form expressions are presented for the variation of the expectation of a given function due to changes in the probability measure used for the expectation. They unveil interesting connections with Gibbs probability measures, the mutual information, and the lautum information.

artificial intelligence, machine learning, probability measure, (16 more...)

2502.02887

Country:

Oceania > French Polynesia (0.04)
North America > United States (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(4 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.71)

Bermudez, Yaiza, Bisson, Gaetan, Esnaola, Iñaki, Perlaza, Samir M.

Proofs for Folklore Theorems on the Radon-Nikodym Derivative

arXiv.org Machine LearningJan-30-2025

Rigorous statements and formal proofs are presented for both foundational and advanced folklore theorems on the Radon-Nikodym derivative. The cases of product and marginal measures are carefully considered; and the hypothesis under which the statements hold are rigorously enumerated.

equality, information theory, theorem, (13 more...)

2501.18374

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
(12 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

arXiv.org Artificial IntelligenceDec-6-2024

CompCap: Improving Multimodal Large Language Models with Composite Captions

Chen, Xiaohui, Shukla, Satya Narayan, Azab, Mahmoud, Singh, Aashu, Wang, Qifan, Yang, David, Peng, ShengYun, Yu, Hanchao, Yan, Shen, Zhang, Xuewen, He, Baosheng

How well can Multimodal Large Language Models (MLLMs) understand composite images? Composite images (CIs) are synthetic visuals created by merging multiple visual elements, such as charts, posters, or screenshots, rather than being captured directly by a camera. While CIs are prevalent in real-world applications, recent MLLM developments have primarily focused on interpreting natural images (NIs). Our research reveals that current MLLMs face significant challenges in accurately understanding CIs, often struggling to extract information or perform complex reasoning based on these images. We find that existing training data for CIs are mostly formatted for question-answer tasks (e.g., in datasets like ChartQA and ScienceQA), while high-quality image-caption datasets, critical for robust vision-language alignment, are only available for NIs. To bridge this gap, we introduce Composite Captions (CompCap), a flexible framework that leverages Large Language Models (LLMs) and automation tools to synthesize CIs with accurate and detailed captions. Using CompCap, we curate CompCap-118K, a dataset containing 118K image-caption pairs across six CI types. We validate the effectiveness of CompCap-118K by supervised fine-tuning MLLMs of three sizes: xGen-MM-inst.-4B and LLaVA-NeXT-Vicuna-7B/13B. Empirical results show that CompCap-118K significantly enhances MLLMs' understanding of CIs, yielding average gains of 1.7%, 2.0%, and 2.9% across eleven benchmarks, respectively.

caption, large language model, machine learning, (19 more...)

2412.05243

Country:

North America > Mexico (0.14)
North America > The Bahamas (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(122 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Government (0.93)
Transportation > Passenger (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Perlaza, Samir M., Zou, Xinying

The Generalization Error of Machine Learning Algorithms

arXiv.org Artificial IntelligenceNov-18-2024

In this paper, the method of gaps, a technique for deriving closed-form expressions in terms of information measures for the generalization error of machine learning algorithms is introduced. The method relies on two central observations: $(a)$~The generalization error is an average of the variation of the expected empirical risk with respect to changes on the probability measure (used for expectation); and~$(b)$~these variations, also referred to as gaps, exhibit closed-form expressions in terms of information measures. The expectation of the empirical risk can be either with respect to a measure on the models (with a fixed dataset) or with respect to a measure on the datasets (with a fixed model), which results in two variants of the method of gaps. The first variant, which focuses on the gaps of the expected empirical risk with respect to a measure on the models, appears to be the most general, as no assumptions are made on the distribution of the datasets. The second variant develops under the assumption that datasets are made of independent and identically distributed data points. All existing exact expressions for the generalization error of machine learning algorithms can be obtained with the proposed method. Also, this method allows obtaining numerous new exact expressions, which improves the understanding of the generalization error; establish connections with other areas in statistics, e.g., hypothesis testing; and potentially, might guide algorithm designs.

artificial intelligence, equality, machine learning, (17 more...)

2411.1203

Country:

Europe > Austria > Vienna (0.14)
North America > Canada (0.14)
Europe > Greece > Attica > Athens (0.04)
(14 more...)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.45)

Daunas, Francisco, Esnaola, Iñaki, Perlaza, Samir M., Poor, H. Vincent

Asymmetry of the Relative Entropy in the Regularization of Empirical Risk Minimization

arXiv.org Machine LearningOct-9-2024

The effect of relative entropy asymmetry is analyzed in the context of empirical risk minimization (ERM) with relative entropy regularization (ERM-RER). Two regularizations are considered: $(a)$ the relative entropy of the measure to be optimized with respect to a reference measure (Type-I ERM-RER); or $(b)$ the relative entropy of the reference measure with respect to the measure to be optimized (Type-II ERM-RER). The main result is the characterization of the solution to the Type-II ERM-RER problem and its key properties. By comparing the well-understood Type-I ERM-RER with Type-II ERM-RER, the effects of entropy asymmetry are highlighted. The analysis shows that in both cases, regularization by relative entropy forces the solution's support to collapse into the support of the reference measure, introducing a strong inductive bias that can overshadow the evidence provided by the training data. Finally, it is shown that Type-II regularization is equivalent to Type-I regularization with an appropriate transformation of the empirical risk function.

equality, erm-rer problem, probability measure, (16 more...)

2410.02833

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > France > Provence-Alpes-Côte d'Azur (0.04)
(13 more...)

Genre: Research Report (0.49)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.92)

Vida, Karina, Damken, Fabian, Lauscher, Anne

Decoding Multilingual Moral Preferences: Unveiling LLM's Biases Through the Moral Machine Experiment

arXiv.org Artificial IntelligenceJul-21-2024

Large language models (LLMs) increasingly find their way into the most diverse areas of our everyday lives. They indirectly influence people's decisions or opinions through their daily use. Therefore, understanding how and which moral judgements these LLMs make is crucial. However, morality is not universal and depends on the cultural background. This raises the question of whether these cultural preferences are also reflected in LLMs when prompted in different languages or whether moral decision-making is consistent across different languages. So far, most research has focused on investigating the inherent values of LLMs in English. While a few works conduct multilingual analyses of moral bias in LLMs in a multilingual setting, these analyses do not go beyond atomic actions. To the best of our knowledge, a multilingual analysis of moral bias in dilemmas has not yet been conducted. To address this, our paper builds on the moral machine experiment (MME) to investigate the moral preferences of five LLMs, Falcon, Gemini, Llama, GPT, and MPT, in a multilingual setting and compares them with the preferences collected from humans belonging to different cultures. To accomplish this, we generate 6500 scenarios of the MME and prompt the models in ten languages on which action to take. Our analysis reveals that all LLMs inhibit different moral biases to some degree and that they not only differ from the human preferences but also across multiple languages within the models themselves. Moreover, we find that almost all models, particularly Llama 3, divert greatly from human values and, for instance, prefer saving fewer people over saving more.

llama 3, moral bias, young fit young fit, (13 more...)

2407.15184

Country:

North America > The Bahamas (0.14)
Asia > Macao (0.04)
Europe > Portugal (0.04)
(80 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (0.47)
Information Technology (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJul-2-2024

Multilingual Trolley Problems for Language Models

Jin, Zhijing, Levine, Sydney, Kleiman-Weiner, Max, Piatti, Giorgio, Liu, Jiarui, Adauto, Fernando Gonzalez, Ortu, Francesco, Strausz, András, Sachan, Mrinmaya, Mihalcea, Rada, Choi, Yejin, Schölkopf, Bernhard

As large language models (LLMs) are deployed in more and more real-world situations, it is crucial to understand their decision-making when faced with moral dilemmas. Inspired by a large-scale cross-cultural study of human moral preferences, "The Moral Machine Experiment", we set up the same set of moral choices for LLMs. We translate 1K vignettes of moral dilemmas, parametrically varied across key axes, into 100+ languages, and reveal the preferences of LLMs in each of these languages. We then compare the responses of LLMs to that of human speakers of those languages, harnessing a dataset of 40 million human moral judgments. We discover that LLMs are more aligned with human preferences in languages such as English, Korean, Hungarian, and Chinese, but less aligned in languages such as Hindi and Somali (in Africa). Moreover, we characterize the explanations LLMs give for their moral choices and find that fairness is the most dominant supporting reason behind GPT-4's decisions and utilitarianism by GPT-3. We also discover "language inequality" (which we define as the model's different development levels in different languages) in a series of meta-properties of moral decision making.

arxiv preprint arxiv, llm, preprint arxiv, (16 more...)

2407.02273

Country:

South America > Venezuela (0.04)
South America > Uruguay (0.04)
South America > Peru (0.04)
(158 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Automobiles & Trucks (0.68)
Transportation > Ground (0.46)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)